Construction of Vietnamese SentiWordNet by using Vietnamese Dictionary

Authors

  • Xuan-Son Vu
  • Seong-Bae Park
Abstract

SentiWordNet is an important lexical resource for sentiment analysis in opinion mining applications. In this paper, we propose a novel approach to constructing a Vietnamese SentiWordNet (VSWN). SentiWordNet is typically generated from WordNet, with each synset assigned numerical scores that indicate its opinion polarity. Many previous studies obtained these scores by applying a machine learning method to WordNet. However, no Vietnamese WordNet was available at the time of writing. We therefore propose a method that constructs VSWN from a Vietnamese dictionary rather than from WordNet. We demonstrate the effectiveness of the proposed method by automatically generating a VSWN with 39,561 synsets. The method is evaluated experimentally on 266 synsets with respect to positivity and negativity. It attains results competitive with the English SentiWordNet, with differences of 0.066 and 0.052 on the positivity and negativity sets, respectively.
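To make the notion of a synset with polarity scores concrete, the following is a minimal sketch of how one such entry might be represented. It is not the authors' data format: the class name `SentiSynset`, its fields, and the example values are hypothetical, and the objectivity score computed as one minus the sum of the positive and negative scores follows the convention of the English SentiWordNet, which the abstract does not state for VSWN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentiSynset:
    """Hypothetical record for one sentiment-scored synset (illustrative only)."""

    synset_id: str   # assumed identifier scheme, not taken from the paper
    terms: tuple     # words sharing this sense
    gloss: str       # dictionary definition the scores are derived from
    pos: float       # positivity score in [0, 1]
    neg: float       # negativity score in [0, 1]

    def __post_init__(self):
        # Reject scores outside the unit interval.
        if not (0.0 <= self.pos <= 1.0 and 0.0 <= self.neg <= 1.0):
            raise ValueError("pos and neg must each lie in [0, 1]")
        # Assumption borrowed from English SentiWordNet: pos + neg + obj = 1.
        if self.pos + self.neg > 1.0:
            raise ValueError("pos + neg must not exceed 1")

    @property
    def obj(self) -> float:
        """Objectivity score implied by the pos + neg + obj = 1 convention."""
        return 1.0 - (self.pos + self.neg)

    @property
    def polarity(self) -> str:
        """Coarse label obtained by comparing the two scores."""
        if self.pos > self.neg:
            return "positive"
        if self.neg > self.pos:
            return "negative"
        return "neutral"


# Invented example values, used only to exercise the class.
entry = SentiSynset(
    synset_id="example-0001",
    terms=("tốt",),
    gloss="illustrative gloss text",
    pos=0.75,
    neg=0.0,
)
print(entry.polarity, entry.obj)
```

Keeping only `pos` and `neg` as stored fields and deriving `obj` from them avoids entries whose three scores are inconsistent with one another.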

Similar resources

Follow-Up Visits in Doctor-Patient Communication: The Vietnamese Case

In a “follow-up visit”, a patient seeks medical attention for an existing health problem. Using data from the Vietnamese public hospital system, we present a more nuanced analysis of follow-ups in health communication than the one currently available. To be specific, we discriminate between “same follow-ups”, in which the doctor is the same one as in the last visit, and “different follow-ups”, ...

Spoken and Written Language Resources for Vietnamese

This paper presents an overview of our activities for spoken and written language resources for Vietnamese implemented at CLIPSIMAG Laboratory and International Research Center MICA. A new methodology for fast text corpora acquisition for minority languages which has been applied to Vietnamese is proposed. The first results of a process of building a large Vietnamese speech database (VNSpeechCo...

Implementing Project Work in Teaching English at High School: The Case of Vietnamese Teachers’ Challenges

Research on using project work in teaching various disciplines has pointed out a number of challenges facing teachers. Similar research in the EFL classroom, however, has been under-researched. This study aimed to fill the gap with a report on the Vietnamese high school teachers’ challenges in implementing project-based learning in the setting of curricular innovation in English instruction nat...

An Unsupervised Learning and Statistical Approach for Vietnamese Word Recognition and Segmentation

There are two main topics in this paper: (i) Vietnamese words are recognized and sentences are segmented into words by using probabilistic models; (ii) the optimum probabilistic model is constructed by an unsupervised learning processing. For each probabilistic model, new words are recognized and their syllables are linked together. The syllable-linking process improves the accuracy of statisti...

How does Dictionary Size Influence Performance of Vietnamese Word Segmentation?

Vietnamese word segmentation (VWS) is a challenging basic issue for natural language processing. This paper addresses the problem of how does dictionary size influence VWS performance, proposes two novel measures: square overlap ratio (SOR) and relaxed square overlap ratio (RSOR), and validates their effectiveness. The SOR measure is the product of dictionary overlap ratio and corpus overlap ra...

Journal:
  • CoRR

Volume: abs/1412.8010

Publication date: 2014